103 research outputs found
Enabling Inter-Repository Access Management between iRODS and Fedora
4th International Conference on Open Repositories. This presentation was part of the session: Conference Presentations. Date: 2009-06-04 08:30 AM - 10:00 AM. Many digital repositories have been built using different technologies such as Fedora and the integrated Rule-Oriented Data System (iRODS). This paper analyzes both the Fedora and iRODS technologies to understand how to integrate the two systems to enable cross-repository data sharing. The areas considered include the digital object model, services, management of distributed storage, external data resources, and policy enforcement. National Science Foundation
File servers, networking, and supercomputers
One of the major tasks of a supercomputer center is managing the massive amount of data generated by application codes. A data flow analysis of the San Diego Supercomputer Center is presented that illustrates the hierarchical data buffering/caching capacity requirements and the associated I/O throughput requirements needed to sustain file service and archival storage. Usage paradigms are examined for both tightly-coupled and loosely-coupled file servers linked to the supercomputer by high-speed networks.
Digital Archive Policies and Trusted Digital Repositories
The MIT Libraries, the San Diego Supercomputer Center, and the University of California San Diego Libraries are conducting the PLEDGE Project to determine the set of policies that affect operational digital preservation archives and to develop standardized means of recording and enforcing them using rules engines. This has the potential to allow for automated assessment of "trustworthiness" of digital preservation archives. We are also evaluating the completeness of other efforts to define policies for digital preservation such as the RLG/NARA Trusted Digital Repository checklist and the PREMIS metadata schema. We present our results to date. National Archives and Records Administration (US) under NSF cooperative agreement 0523307 through a supplement to SCI 0438741 and the NSF grant ITR 0427196
Digital Library and Data Grid Technology Group
Digital libraries and data grids manage state information about data collections. This is in contrast to the management of semantic information used for discovery that is provided by semantic web technology. The discussion group investigated the types of inferences and relationship management that would improve digital library and grid services. Notable examples include management of relationships discovered by data mining services, management of properties associated with grid name spaces, and management of properties associated with encoding format structure descriptions.
Server-side workflow execution using data grid technology for reproducible analyses of data-intensive hydrologic systems
Many geoscience disciplines utilize complex computational models for advancing understanding and sustainable management of Earth systems. Executing such models and their associated data preprocessing and postprocessing routines can be challenging for a number of reasons including (1) accessing and preprocessing the large volume and variety of data required by the model, (2) postprocessing large data collections generated by the model, and (3) orchestrating data processing tools, each with unique software dependencies, into workflows that can be easily reproduced and reused. To address these challenges, the work reported in this paper leverages the Workflow Structured Object functionality of the Integrated Rule-Oriented Data System and demonstrates how it can be used to access distributed data, encapsulate hydrologic data processing as workflows, and federate with other community-driven cyberinfrastructure systems. The approach is demonstrated for a study investigating the impact of drought on populations in the Carolinas region of the United States. The analysis leverages computational modeling along with data from the Terra Populus project and data management and publication services provided by the Sustainable Environment-Actionable Data project. The work is part of a larger effort under the DataNet Federation Consortium project that aims to demonstrate data and computational interoperability across cyberinfrastructure developed independently by scientific communities.
Plain Language Summary: Executing computational workflows in the geosciences can be challenging, especially when dealing with large, distributed, and heterogeneous data sets and computational tools. We present a methodology for addressing this challenge using the Integrated Rule-Oriented Data System (iRODS) Workflow Structured Object (WSO). We demonstrate the approach through an end-to-end application of data access, processing, and publication of digital assets for a scientific study analyzing drought in the Carolinas region of the United States.
Key Points: Reproducibility of data-intensive analyses remains a significant challenge. Data grids are useful for reproducibility of workflows requiring large, distributed data sets. Data and computations should be co-located on servers to create executable Web-resources.
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/137520/1/ess271_am.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/137520/2/ess271.pd
Global urban environmental change drives adaptation in white clover
Urbanization transforms environments in ways that alter biological evolution. We examined whether urban environmental change drives parallel evolution by sampling 110,019 white clover plants from 6169 populations in 160 cities globally. Plants were assayed for a Mendelian antiherbivore defense that also affects tolerance to abiotic stressors. Urban-rural gradients were associated with the evolution of clines in defense in 47% of cities throughout the world. Variation in the strength of clines was explained by environmental changes in drought stress and vegetation cover that varied among cities. Sequencing 2074 genomes from 26 cities revealed that the evolution of urban-rural clines was best explained by adaptive evolution, but the degree of parallel adaptation varied among cities. Our results demonstrate that urbanization leads to adaptation at a global scale.
Prototype Preservation Environments
The Persistent Archive Testbed and National Archives and Records
Administration (NARA) research prototype persistent archive are
examples of preservation environments. Both projects are using
data grids to implement data management infrastructure that can
manage technology evolution. Data grids are software systems that
provide persistent names to digital entities, manage data that are
distributed across multiple types of storage systems, and provide support
for preservation metadata. A persistent archive federates multiple
data grids to provide the fault tolerance and disaster recovery
mechanisms essential for long-term preservation. The capabilities
of the prototype persistent archives will be presented, along with
examples of how the capabilities are used to support the preservation
of email, Web crawls, office products, image collections, and
electronic records. (from the article) published or submitted for publication